Using Parallel Propbanks to enhance Word-alignments
Abstract
This short paper describes the use of the linguistic annotation available in parallel PropBanks (Chinese and English) for the enhancement of automatically derived word alignments. Specifically, we suggest ways to refine and expand word alignments for verb-predicates by using predicate-argument structures. Evaluations demonstrate improved alignment accuracies that vary by corpus type.
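The idea of using predicate-argument structures to refine verb-predicate alignments can be illustrated with a minimal sketch. This is not the authors' method, which the abstract does not specify. It assumes that both PropBanks use comparable role labels (such as ARG0 and ARG1), that a baseline word alignment is available as a set of index pairs, and that a fixed threshold decides when to add a link. The names `argument_match_score`, `refine_predicate_links`, and `threshold` are hypothetical.

```python
from typing import Dict, Optional, Set, Tuple

# Hypothetical data shapes, chosen for illustration only.
Alignment = Set[Tuple[int, int]]   # (source token index, target token index)
Args = Dict[str, Set[int]]         # role label -> token indices of that argument


def argument_match_score(src_args: Args, tgt_args: Args, links: Alignment) -> float:
    """Fraction of arguments that share a label and at least one alignment link."""
    if not src_args or not tgt_args:
        return 0.0
    matched = 0
    for label, src_tokens in src_args.items():
        tgt_tokens = tgt_args.get(label, set())
        # An argument pair counts as matched if any existing link connects them.
        if any((s, t) in links for s in src_tokens for t in tgt_tokens):
            matched += 1
    return matched / max(len(src_args), len(tgt_args))


def refine_predicate_links(
    src_preds: Dict[int, Args],
    tgt_preds: Dict[int, Args],
    links: Alignment,
    threshold: float = 0.5,  # assumed cut-off, not taken from the paper
) -> Alignment:
    """Return a copy of `links` with a predicate link added for each source
    predicate whose best-scoring target predicate reaches the threshold."""
    refined = set(links)
    for s_idx, s_args in src_preds.items():
        best_t: Optional[int] = None
        best_score = 0.0
        for t_idx, t_args in tgt_preds.items():
            score = argument_match_score(s_args, t_args, links)
            if score > best_score:
                best_t, best_score = t_idx, score
        if best_t is not None and best_score >= threshold:
            refined.add((s_idx, best_t))
    return refined


# Toy example: source predicate at token 1, target predicate at token 2.
src_preds = {1: {"ARG0": {0}, "ARG1": {2, 3}}}
tgt_preds = {2: {"ARG0": {0, 1}, "ARG1": {3}}}
links = {(0, 0), (3, 3)}  # baseline alignment misses the verb pair (1, 2)
refined = refine_predicate_links(src_preds, tgt_preds, links)
```

In the toy example the baseline alignment links the arguments but not the verbs; because both labelled arguments are connected by existing links, the sketch adds the missing predicate link while leaving the original links untouched.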
Similar resources
Detecting Cross-lingual Semantic Similarity Using Parallel PropBanks
This paper suggests a method for detecting cross-lingual semantic similarity using parallel PropBanks. We begin by improving word alignments for verb predicates generated by GIZA++ by using information available in parallel PropBanks. We applied the Kuhn-Munkres method to measure predicate-argument matching and improved verb predicate alignments by an F-score of 12.6%. Using the enhanced word al...
Enhancing Phrase Extraction from Word Alignments Using Morphology
We propose a technique for effective extraction of bilingual phrases from word alignments using morphological processing. Morphological processing increases the frequency of words in the corpus and consequently reduces the Alignment Error Rate (AER). Intuitively, better word alignments enhance the quality of the bilingual phrases extracted. Using alignments of a stemmed corpus for phrase ext...
Sentence and word alignment using Support Vector Machines
Sentence and word alignment are prerequisite tasks for any system concerning statistical machine translation. Although they seem very different, both sentence and word alignments require approximately the same features to discriminate between positive and negative examples of alignments. We present a solution that can align the sentences and the words of a parallel corpus using support vector m...
Transferring Syntactic Relations from English to Hindi Using Alignments on Local Word Groups
Various works have used word alignments in parallel corpora to transfer information like POS tags, syntactic trees and word senses from source to target sentences. In this paper, we work on the problem of projecting syntactic relations from English to morphologically rich Hindi parallel text. We show the effectiveness of Local Word Groups (LWGs) in simplifying alignments as well as in transferr...
Building a Hierarchically Aligned Chinese-English Parallel Treebank
We construct a hierarchically aligned Chinese-English parallel treebank by manually doing word alignments and phrase alignments simultaneously on parallel phrase-based parse trees. The main innovation of our approach is that we leave words without a translation counterpart (which are mostly language-particular function words) unaligned on the word level, and locate and align the appropriate phr...